Abstract. A language $L$ over a finite alphabet $\Sigma$ is growth-sensitive (or entropy sensitive)
if forbidding any finite set of factors $F$ of $L$ yields a sub-language $L^F$ whose exponential
growth rate (entropy) is smaller than that of $L$. Let $(X, E, \ell)$ be an infinite, oriented,
edge-labelled graph with label alphabet $\Sigma$. Considering the graph as an (infinite) automaton, we
associate with any pair of vertices $x, y \in X$ the language $L_{x,y}$ consisting of all words that
can be read as the labels along some path from $x$ to $y$. Under suitable, general assumptions
we prove that these languages are growth-sensitive. This is based on using Markov chains
with forbidden transitions.



1. Introduction 

Let $\Sigma$ be a finite alphabet and $\Sigma^*$ the set of all finite words over $\Sigma$, including the empty
word $\epsilon$. A language $L$ over $\Sigma$ is a subset of $\Sigma^*$. All our languages will be infinite. We denote
by $|w|$ the length of the word $w$. A factor of a word $w = a_1 \cdots a_n$ is a word of the form
$a_i a_{i+1} \cdots a_j$, with $1 \le i \le j \le n$. The growth or entropy of $L$ is

$$h(L) = \limsup_{n \to \infty} \frac{1}{n} \log \bigl|\{ w \in L : |w| = n \}\bigr|.$$

For a finite, non-empty set $F \subset \Sigma^+ = \Sigma^* \setminus \{\epsilon\}$ consisting of factors of elements of $L$, we let

$$L^F = \{ w \in L : \text{no } v \in F \text{ is a factor of } w \}.$$

The issue addressed here is to provide conditions under which, for a class of languages
associated with infinite graphs, $h(L^F) < h(L)$. If this holds for any set $F$ of forbidden factors,
then the language $L$ is called growth sensitive (or entropy sensitive).
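As a small illustration of these definitions, the following sketch counts words by brute force for the full language $L = \Sigma^*$ with $\Sigma = \{a, b\}$ and the forbidden set $F = \{bb\}$. The example, the function names and the cut-off $n = 12$ are assumptions chosen for illustration; the finite-$n$ quotient only approximates the limsup in the definition of $h$.

```python
from itertools import product
from math import log

def count_words(alphabet, forbidden, n):
    """Count words of length n over `alphabet` that contain no factor in `forbidden`."""
    total = 0
    for letters in product(alphabet, repeat=n):
        word = "".join(letters)
        if not any(v in word for v in forbidden):
            total += 1
    return total

def growth_estimate(alphabet, forbidden, n):
    """Finite-n estimate (1/n) * log(number of admissible words of length n)."""
    return log(count_words(alphabet, forbidden, n)) / n

# Illustrative choice: full language over {a, b}, forbidding the factor "bb".
full = growth_estimate("ab", [], 12)            # equals log 2 up to rounding
restricted = growth_estimate("ab", ["bb"], 12)  # strictly smaller estimate
print(full, restricted)
```

For this finite-state example the strict drop is the classical situation; the point of the paper is the analogous statement for languages read along paths in infinite graphs.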

Questions related to growth sensitivity have been considered in different contexts.

In group theory, in relation with regular normal forms of finitely generated groups, the study
of growth-sensitivity has been proposed by Grigorchuk and de la Harpe [9] as a tool
for proving Hopfianity of a given group or class of groups, see also Arzhantseva and
Lysenok [1] and Ceccherini-Silberstein and Scarabotti [4].

In symbolic dynamics, the number $h(L)$ associated with a regular language accepted by a
finite automaton with suitable properties appears as the topological entropy of a sofic system,
see Lind and Marcus [11, Chapters 3 & 4]. Entropy sensitivity appears as the strict
inequality between the entropies of an irreducible sofic shift and a proper subshift
[11, Cor. 4.4.9].

Motivated by these bodies of work, Ceccherini-Silberstein and Woess [6], [7], [5] have 
elaborated practicable criteria that guarantee growth-sensitivity of context-free languages. 

The main result of the present note can be seen as a direct extension of [11, Cor. 4.4.9] to
the entropies of infinite sofic systems; see below for further comments and references.

Our basic object is an infinite oriented graph $(X, E, \ell)$ whose edges are labelled by elements
of a finite alphabet $\Sigma$. Each edge has the form $e = (x, a, y)$, where $e^- = x$ and $e^+ = y \in X$
are the initial and the terminal vertex of $e$ and $\ell(e) = a \in \Sigma$ is its label. We will also write
$x \xrightarrow{a} y$ for the edge $e = (x, a, y)$, or just $x \to y$ in situations where we do not care about the
label. Multiple edges and loops are allowed, but two edges with the same end vertices must
have distinct labels.

A path of length $n$ in $(X, E, \ell)$ is a sequence $\pi = e_1 e_2 \cdots e_n$ of edges such that $e_i^+ = e_{i+1}^-$, for
$i = 1, 2, \dots, n-1$. We say that it is a path from $x$ to $y$, if $e_1^- = x$ and $e_n^+ = y$. The label $\ell(\pi)$
of $\pi$ is the word $\ell(\pi) = \ell(e_1)\ell(e_2) \cdots \ell(e_n) \in \Sigma^*$ that we read along the path. We also allow
the empty path from $x$ to $x$, whose label is the empty word $\epsilon \in \Sigma^*$. For $x, y \in X$, denote by
$\Pi_{x,y}$ the set of all paths $\pi$ from $x$ to $y$ in $(X, E, \ell)$.

The languages which we consider here are

$$L_{x,y} = \{ \ell(\pi) \in \Sigma^* : \pi \in \Pi_{x,y} \}, \quad \text{where } x, y \in X.$$

That is, we can interpret the edge-labelled graph $(X, E, \ell)$ as an infinite automaton (labelled
digraph) with initial state $x$ and terminal state $y$, so that $L_{x,y}$ is the language accepted by
the automaton.
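The definition of $L_{x,y}$ can be made concrete on a small finite graph. The sketch below is an illustration only: the three-edge graph `EDGES` and the helper `path_labels` are hypothetical, and a finite graph merely stands in for the infinite ones treated in the paper.

```python
def path_labels(edges, x, y, n):
    """Labels of all paths of length n from x to y; edges are triples (x, a, y)."""
    current = {(x, "")}  # pairs (end vertex reached, label read so far)
    for _ in range(n):
        current = {
            (t, w + a)
            for (v, w) in current
            for (s, a, t) in edges
            if s == v
        }
    return {w for (v, w) in current if v == y}

# Hypothetical example: 'a' switches between 0 and 1, 'b' is a loop at 0.
EDGES = [(0, "a", 1), (1, "a", 0), (0, "b", 0)]

print(sorted(path_labels(EDGES, 0, 0, 3)))
```

Taking the union of `path_labels(EDGES, x, y, n)` over all $n \ge 0$ gives the words of $L_{x,y}$ for this toy graph; the empty path contributes the empty word when $x = y$.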

We say that $(X, E, \ell)$ is deterministic, if for every vertex $x$ and every $a \in \Sigma$, there is at
most one edge with initial point $x$ and label $a$. Any automaton (finite or infinite) can be
transformed into a deterministic one that accepts the same language, by the well known
powerset construction. See e.g. [2, Prop. 1.4.1].

As in the finite case, we need an irreducibility assumption. The graph $(X, E, \ell)$ is called
strongly connected, if for every pair of vertices $x, y$, there is an (oriented) path from $x$ to $y$.
Furthermore, we say that it is uniformly connected, if in addition the following holds.

• There is a constant $K$ such that for every edge $x \to y$ there is a path from $y$ to $x$ with
length at most $K$.

In the finite case, the two notions coincide as one can take $K = |X|$. The forward distance
$d^+(x, y)$ of $x, y \in X$ is the minimum length of a path from $x$ to $y$. We write

$$h(X) = h(X, E, \ell) = \sup_{x, y \in X} h(L_{x,y})$$

and call this the entropy of our oriented, labelled graph. It is a well known and easy to prove
fact that for a strongly connected graph, $h(L_{x,y}) = h(X)$ for all $x, y \in X$.

We also need a reasonable assumption on the set of forbidden factors. 

We say that a finite set $F \subset \Sigma^+$ is relatively dense in the graph $(X, E, \ell)$, if there is a constant
$D$ such that for every $x \in X$ there are $y \in X$ and $w \in F$ such that $d^+(x, y) \le D$ and there
is a path starting at $y$ which has label $w$.
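On a finite graph, the constants $K$ (uniform connectedness) and $D$ (relative denseness) can be computed directly from the forward distance $d^+$. The following sketch does this by breadth-first search; the graph `EDGES` and all function names are assumptions made for illustration, and in the infinite setting of the paper these constants have to be established by an argument rather than by enumeration.

```python
from collections import deque

# Hypothetical finite example: 'a' switches between 0 and 1, 'b' is a loop at 0.
EDGES = [(0, "a", 1), (1, "a", 0), (0, "b", 0)]

def forward_distances(edges, x):
    """Breadth-first search: d^+(x, y) for every y reachable from x."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        for (s, _a, t) in edges:
            if s == v and t not in dist:
                dist[t] = dist[v] + 1
                queue.append(t)
    return dist

def can_read(edges, y, word):
    """True if some path starting at y has label `word`."""
    ends = {y}
    for a in word:
        ends = {t for (s, b, t) in edges if s in ends and b == a}
    return bool(ends)

def connectivity_constant(edges):
    """Smallest K with d^+(y, x) <= K for every edge x -> y, or None if none exists."""
    worst = 0
    for (x, _a, y) in edges:
        back = forward_distances(edges, y).get(x)
        if back is None:
            return None  # some edge has no return path
        worst = max(worst, back)
    return worst

def denseness_constant(edges, forbidden):
    """Smallest D witnessing relative denseness of `forbidden`, or None."""
    vertices = {v for (s, _a, t) in edges for v in (s, t)}
    worst = 0
    for x in vertices:
        dist = forward_distances(edges, x)
        reach = [d for y, d in dist.items()
                 if any(can_read(edges, y, w) for w in forbidden)]
        if not reach:
            return None  # no element of F can be read anywhere reachable from x
        worst = max(worst, min(reach))
    return worst

print(connectivity_constant(EDGES), denseness_constant(EDGES, ["bb"]))
```

In this toy graph the word `bb` can only be read starting at vertex 0, so vertex 1 needs one step before a forbidden word becomes readable.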

Note that the assumptions of uniform connectedness and relative denseness cannot be
dropped: both play an essential role in the proof of the main result, which fails without
them.

Theorem 1.1. Suppose that $(X, E, \ell)$ is uniformly connected and deterministic with label
alphabet $\Sigma$. Let $F \subset \Sigma^+$ be a finite, non-empty set which is relatively dense in $(X, E, \ell)$.
Then

$$\sup_{x, y \in X} h(L^F_{x,y}) < h(X) \quad \text{strictly.}$$

We say that $(X, E, \ell)$ is fully deterministic, if for every $x \in X$ and $a \in \Sigma$, there is precisely
one edge with initial point $x$ and label $a$. Note that in automata theory, the classical
terminology is deterministic and complete, instead of fully deterministic. Since in graph
theory a complete graph is one in which every pair of distinct vertices is connected by a
unique edge, we shall use the notion of fully deterministic graphs throughout this paper.

Corollary 1.2. If $(X, E, \ell)$ is uniformly connected and fully deterministic then $L_{x,y}$ is
growth-sensitive for all $x, y \in X$.

Indeed, in this case, for every $x \in X$ and every $w \in \Sigma^*$, there is precisely one path with label
$w$ starting at $x$.

With our edge-labelled graph $(X, E, \ell)$, we can consider the full shift space which consists
of all bi-infinite words over $\Sigma$ that can be read along the edges of some bi-infinite path
in $(X, E, \ell)$. When $(X, E, \ell)$ is strongly connected, the entropy $h(L_{x,y})$ is independent of
$x$ and $y$ and equals the topological entropy of the full shift space of the graph. See e.g.
Gurevic [10], Petersen [14] or Boyle, Buzzi and Gomez [3] for a selection of related
work and references, and also the discussion in [11, §13.9].

If we consider the shift space consisting of all those bi-infinite words as above that do not
contain any factor in $F$, then the interpretation of Corollary 1.2 is that the associated entropy
is strictly smaller than $h(X)$.

The theorem, once approached in the right way, is not hard to prove. It is based on a classical 
tool, a version of the Perron-Frobenius theorem for infinite non-negative matrices; see e.g. 
Seneta [16]. We shall first reformulate things in terms of Markov chains and forbidden 
transitions. 

2. Markov chains and forbidden transitions 

We now equip the oriented, edge-labelled graph $(X, E, \ell)$ with additional data: with each
edge $e = (x, a, y)$, we associate a probability $p(e) = p(x, a, y) \ge \alpha > 0$, where $\alpha$ is a fixed
constant, such that

$$\sum_{e \in E \,:\, e^- = x} p(e) \le 1 \quad \text{for every } x \in X. \tag{1}$$

Our assumption to have the uniform lower bound $p(e) \ge \alpha$ for each edge implies that the
outdegree (number of outgoing edges) of each vertex is bounded by $1/\alpha$. We interpret $p(e)$ as
the probability that a particle with current position $x = e^-$ moves in one (discrete) time unit
along $e$ to its end vertex $y = e^+$. Observing the successive random positions of the particle
at the time instants $0, 1, 2, \dots$, we obtain a Markov chain with state space $X$ whose one-step
transition probabilities are

$$p(x, y) = \sum_{a \in \Sigma \,:\, (x, a, y) \in E} p(x, a, y).$$

We shall also want to record the edges, resp. their labels used in each step, which means to
consider a Markov chain on a somewhat larger state space, but we will not need to formalise
this in detail. In (1), we admit the possibility that $1 - \sum_y p(x, y) > 0$ for some $x$. This
number is then interpreted as the probability that a particle positioned at $x$ dies at the next
step.
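The passage from edge probabilities $p(x, a, y)$ to the transition probabilities $p(x, y)$, together with the death probability left over in (1), can be sketched as follows. The dictionary `EDGE_PROBS` and the function names are hypothetical; exact rational arithmetic is used so that the sums can be compared without rounding.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical edge probabilities p(x, a, y) on two vertices; note the two
# parallel edges from 0 to 1, which carry distinct labels as required.
EDGE_PROBS = {
    (0, "a", 1): Fraction(1, 4),
    (0, "b", 1): Fraction(1, 4),
    (0, "b", 0): Fraction(1, 4),
    (1, "a", 0): Fraction(1, 2),
}

def transition_matrix(edge_probs):
    """Sum p(x, a, y) over all labels a to obtain p(x, y)."""
    p = defaultdict(Fraction)
    for (x, _a, y), prob in edge_probs.items():
        p[(x, y)] += prob
    return dict(p)

def death_probability(edge_probs, x):
    """1 minus the total outgoing probability at x, as permitted by condition (1)."""
    return 1 - sum(prob for (s, _a, _t), prob in edge_probs.items() if s == x)

P = transition_matrix(EDGE_PROBS)
print(P, death_probability(EDGE_PROBS, 0), death_probability(EDGE_PROBS, 1))
```

Here every edge probability is at least $\alpha = 1/4$, so each vertex has at most $1/\alpha = 4$ outgoing edges, in line with the remark on bounded outdegree.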

We write $p^{(n)}(x, y)$ for the probability that the particle starting at $x$ is at position $y$
after $n$ steps. This is the $(x, y)$-element of the $n$-th power $P^n$ of the transition matrix
$P = \bigl(p(x, y)\bigr)_{x, y \in X}$. If $(X, E, \ell)$ is strongly connected, then $P$ is irreducible, and it is well-known
that the number

$$\rho(P) = \limsup_{n \to \infty} p^{(n)}(x, y)^{1/n}$$

is independent of $x$ and $y$. See once more [16]. Often, $\rho(P)$ is called the spectral radius of
$P$. It is the parameter of exponential decay of the transition probabilities.
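For a concrete infinite example, consider the nearest-neighbour random walk on the integers that moves right with probability $p$ and left with probability $q = 1 - p$; for this walk the closed form $\rho(P) = 2\sqrt{pq}$ is classical. The sketch below is an illustration under these assumptions (the walk, the values $p = 0.7$, $q = 0.3$ and the horizon $n = 400$ are not taken from the paper). It computes $p^{(n)}(0, 0)$ exactly up to floating-point rounding, which is possible because only positions in $[-n, n]$ are reachable in $n$ steps.

```python
from math import sqrt

def n_step_return(p_right, p_left, n):
    """p^(n)(0, 0) for the walk on Z moving +1 w.p. p_right and -1 w.p. p_left."""
    dist = {0: 1.0}  # distribution of the position after the steps made so far
    for _ in range(n):
        nxt = {}
        for x, mass in dist.items():
            nxt[x + 1] = nxt.get(x + 1, 0.0) + mass * p_right
            nxt[x - 1] = nxt.get(x - 1, 0.0) + mass * p_left
        dist = nxt
    return dist.get(0, 0.0)

n = 400
estimate = n_step_return(0.7, 0.3, n) ** (1.0 / n)  # finite-n approximation
closed_form = 2 * sqrt(0.7 * 0.3)                   # classical value of rho(P)
print(estimate, closed_form)
```

The finite-$n$ value lies slightly below the closed form, since the return probabilities carry a polynomial correction factor that only disappears in the limit.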

Let once more $F \subset \Sigma^+$ be finite. We interpret the elements of $F$ as sequences of forbidden
transitions. That is, we restrict the motion of the particle: at no time is it allowed to traverse
any path $\pi$ with $\ell(\pi) \in F$ in $k$ successive steps, where $k$ is the length of $\pi$. We write $p_F^{(n)}(x, y)$
for the probability that the particle starting at $x$ is at position $y$ after $n$ steps, without having
made any such sequence of forbidden transitions. Let

$$\rho_{x,y}(P_F) = \limsup_{n \to \infty} p_F^{(n)}(x, y)^{1/n}, \quad x, y \in X.$$

These numbers are not necessarily independent of $x$ and $y$, and the $p_F^{(n)}(x, y)$ are not the
elements of the $n$-th matrix power of some substochastic matrix.

Recall that a transition matrix $Q = \bigl(q(x, y)\bigr)_{x, y \in X}$ on the state space $X$ is called strictly
substochastic if there exists a constant $\varepsilon > 0$, such that for all $x \in X$

$$\sum_{y \in X} q(x, y) \le 1 - \varepsilon.$$

That is, all row sums are bounded by $1 - \varepsilon$. In order to give an upper bound for the restricted
transition probabilities $p_F^{(n)}(x, y)$, we first show the following.

Lemma 2.1. Suppose that $(X, E, \ell)$ is strongly connected with label alphabet $\Sigma$ and equipped
with transition probabilities $p(e) \ge \alpha > 0$, $e \in E$. Let $F \subset \Sigma^+$ be a finite, non-empty set
which is relatively dense in $(X, E, \ell)$. Then there are $k \in \mathbb{N}$ and $\varepsilon_0 > 0$ such that

$$\sum_{y \in X} p_F^{(k)}(x, y) \le 1 - \varepsilon_0 \quad \text{for all } x \in X.$$

In other words, the transition matrix $Q = \bigl(p_F^{(k)}(x, y)\bigr)_{x, y \in X}$ is strictly substochastic, with all
row sums bounded by $1 - \varepsilon_0$.

Proof. Let $R = \max_{w \in F} |w|$, and let $D \in \mathbb{N}$ be the constant from the definition of relative
denseness of $F$. Set $k = D + R$. For each $x \in X$, we can find a path $\pi_1$ from $x$ to some
$y \in X$ with length $d \le D$ and a path $\pi_2$ starting at $y$ which has label $w \in F$. Let $z$ be the
endpoint of $\pi_2$, and choose any path $\pi_3$ that starts at $z$ and has length $k - d - |w|$. (Such a
path exists by strong connectedness.) Then let $\pi$ be the path obtained by concatenating $\pi_1$,
$\pi_2$ and $\pi_3$.

The probability that the Markov chain starting at $x$ makes its first $k$ steps along the edges
of $\pi$ is

$$\mathbb{P}(\pi) \ge \alpha^k = \varepsilon_0 > 0.$$

Since the label of $\pi$ contains the forbidden factor $w$, the path $\pi$ does not contribute to the
restricted probabilities. Hence

$$\sum_{y \in X} p_F^{(k)}(x, y) \le \sum_{y \in X} p^{(k)}(x, y) - \mathbb{P}(\pi) \le 1 - \varepsilon_0,$$

and this upper bound holds for every $x$. □
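The bound of Lemma 2.1 can be checked by enumeration on a small example. In the sketch below, the two-vertex fully deterministic graph with all edge probabilities equal to $\alpha = 1/2$ and the forbidden set $F = \{bb\}$ are assumptions chosen for illustration; here $D = 0$ and $R = 2$, so $k = 2$ and the lemma predicts row sums of at most $1 - \alpha^k = 3/4$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
# Hypothetical fully deterministic graph on {0, 1}: 'a' switches vertex, 'b' stays.
EDGE_PROBS = {
    (0, "a", 1): HALF, (0, "b", 0): HALF,
    (1, "a", 0): HALF, (1, "b", 1): HALF,
}

def restricted_row_sum(edge_probs, x, k, forbidden):
    """Sum over y of p_F^(k)(x, y): total mass of length-k paths from x avoiding F."""
    paths = [(x, "", Fraction(1))]  # triples (end vertex, label, probability)
    for _ in range(k):
        paths = [
            (t, w + a, pr * q)
            for (v, w, pr) in paths
            for (s, a, t), q in edge_probs.items()
            if s == v
        ]
    return sum(pr for (_v, w, pr) in paths
               if not any(f in w for f in forbidden))

k = 2
row_sums = {x: restricted_row_sum(EDGE_PROBS, x, k, ["bb"]) for x in (0, 1)}
print(row_sums)
```

In this example exactly one of the four equally likely length-2 labels is forbidden from each vertex, so the bound is attained with equality.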

The matrix $P$ acts on functions $h : X \to \mathbb{R}$ by $Ph(x) = \sum_y p(x, y) h(y)$. Next, we state two
key results due to Pruitt [15, Lemma 1] and [15, Corollary to Theorem 2], which will be used
in the proof of the main result.

Lemma 2.2. If the transition matrix $P$ is irreducible and $Ph \le sh$ for some $s > 0$ and
a non-negative function $h \ne 0$, then $h > 0$.

Lemma 2.3. If the transition matrix $P = \bigl(p(x, y)\bigr)_{x, y \in X}$ is such that for every $x \in X$ the
entries $p(x, y) = 0$ for all $y \in X$ except finitely many, then the equation

$$Ph = sh$$

has a non-negative solution $h \ne 0$ for all $s \ge \rho(P)$.

Using these lemmas, we prove the following result on sensitivity of the Markov chain with
respect to forbidding the transitions in $F$.

Theorem 2.4. Suppose that $(X, E, \ell)$ is uniformly connected with label alphabet $\Sigma$ and
equipped with transition probabilities $p(e) \ge \alpha > 0$, $e \in E$. Let $F \subset \Sigma^+$ be a finite, non-empty
set which is relatively dense in $(X, E, \ell)$. Then

$$\sup_{x, y \in X} \rho_{x,y}(P_F) < \rho(P) \quad \text{strictly.}$$

Proof. We shall proceed in two steps.

Step 1. We assume that $P = \bigl(p(x, y)\bigr)_{x, y \in X}$ is stochastic and that $\rho(P) = 1$.

Consider the matrix $Q$ of Lemma 2.1. Let $Q^n = \bigl(q^{(n)}(x, y)\bigr)_{x, y \in X}$ be its $n$-th matrix power.
Then $q^{(n)}(x, y)$ is the probability that the Markov chain starting at $x$ is in $y$ at time $nk$ and does not
make any forbidden sequence of transitions in each of the discrete time intervals $[(j-1)k, jk]$
for $j \in \{1, \dots, n\}$. Therefore

$$p_F^{(nk)}(x, y) \le q^{(n)}(x, y),$$

and also, by the same reasoning, for $i = 0, \dots, k-1$,

$$p_F^{(nk+i)}(x, y) \le \sum_{z \in X} q^{(n)}(x, z)\, p_F^{(i)}(z, y).$$

Therefore, for every $x \in X$ and $i = 0, \dots, k-1$,

$$\sum_{y \in X} p_F^{(nk+i)}(x, y) \le \sum_{z \in X} q^{(n)}(x, z) \sum_{y \in X} p_F^{(i)}(z, y) \le (1 - \varepsilon_0)^n,$$

since Lemma 2.1 implies that the row sums of the matrix power $Q^n$ are bounded above by
$(1 - \varepsilon_0)^n$. We conclude that

$$\limsup_{n \to \infty} p_F^{(nk+i)}(x, y)^{1/(nk+i)} \le (1 - \varepsilon_0)^{1/k},$$

so that $\rho_{x,y}(P_F) \le (1 - \varepsilon_0)^{1/k} = 1 - \varepsilon$, where $\varepsilon > 0$.

Step 2. General case. We reduce this case to the previous one.

Since $P$ is irreducible and every row of $P$ has only finitely many non-zero entries, Lemma
2.2 and Lemma 2.3 guarantee the existence of a strictly positive solution $h : X \to \mathbb{R}$ for the
equation

$$Ph = \rho(P) \cdot h,$$

that is, $h$ is $\rho(P)$-harmonic. Consider now the $h$-transform of the transition probabilities
$p(e)$ of $P$, $e = (x, a, y) \in E$, given by

$$p^h(x, a, y) = \frac{p(x, a, y)\, h(y)}{\rho(P)\, h(x)},$$

and the associated transition matrix $P^h$ with entries

$$p^h(x, y) = \sum_{a \,:\, (x, a, y) \in E} p^h(x, a, y).$$

The Markov chain associated with $P^h$ is called the $h$-process.

The matrix $P^h$ is stochastic, because $h$ is $\rho(P)$-harmonic, and $\rho(P^h) = 1$. Using uniform
connectedness, we show that there is a constant $\bar\alpha > 0$ such
that $p^h(e) \ge \bar\alpha$ for each $e = (x, a, y) \in E$. Indeed, for such an edge, there is $k \le K$ such that
$d^+(y, x) = k$, whence

$$\rho(P)^k h(y) = \sum_{z \in X} p^{(k)}(y, z)\, h(z) \ge \alpha^k h(x),$$

so that

$$p^h(x, a, y) \ge \bigl(\alpha / \rho(P)\bigr)^{k+1}.$$

Recall that $K$ is the constant used in the definition of uniform connectedness. Since
$\alpha \le \rho(P)$, we can now choose $\bar\alpha = \bigl(\alpha / \rho(P)\bigr)^{K+1}$. We see that with $P^h$ we are now in the situation of Step 1.
Thus, forbidding the transitions of $F$ for the Markov chain with transition matrix $P^h$, we get
$\rho_{x,y}(P^h_F) \le 1 - \varepsilon$ for all $x, y \in X$, where $\varepsilon > 0$.

We now show that $\rho_{x,y}(P^h_F) = \rho_{x,y}(P_F) / \rho(P)$, which will conclude the proof.

For a path $\pi = e_1 \cdots e_n$ from $x$ to $y$, let (as above) $\mathbb{P}(\pi)$ be the probability that the original
Markov chain traverses the edges of $\pi$ in $n$ successive steps, and let $\mathbb{P}^h(\pi)$ be the analogous
probability with respect to the $h$-process. Then

$$\mathbb{P}^h(\pi) = \frac{\mathbb{P}(\pi)\, h(y)}{\rho(P)^n\, h(x)}.$$

Let us write $\Pi^n_{x,y}(\neg F)$ for the set of all paths $\pi$ from $x$ to $y$ with length $n$ for which $\ell(\pi)$ does
not contain a factor in $F$. Then the $n$-step transition probabilities of the $h$-process with the
transitions in $F$ forbidden are

$$p_F^{h,(n)}(x, y) = \sum_{\pi \in \Pi^n_{x,y}(\neg F)} \mathbb{P}^h(\pi) = \sum_{\pi \in \Pi^n_{x,y}(\neg F)} \frac{\mathbb{P}(\pi)\, h(y)}{\rho(P)^n\, h(x)} = \frac{p_F^{(n)}(x, y)\, h(y)}{\rho(P)^n\, h(x)}.$$

Taking $n$-th roots and passing to the upper limit, we obtain the required identity. □
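The $h$-transform and the path identity used in Step 2 can be verified exactly on a small substochastic example. Everything in the sketch below is an assumption made for illustration: the three edges in `EDGE_PROBS` were chosen so that the transition matrix has spectral radius $3/8$ with positive eigenfunction $h = (3, 1)$, which the code confirms before transforming.

```python
from fractions import Fraction as Fr

# Hypothetical substochastic example on the vertex set {0, 1}.
EDGE_PROBS = {(0, "a", 0): Fr(1, 4), (0, "b", 1): Fr(3, 8), (1, "a", 0): Fr(1, 8)}
RHO = Fr(3, 8)               # spectral radius of this particular matrix
H = {0: Fr(3), 1: Fr(1)}     # positive solution of P h = RHO * h

def is_harmonic(edge_probs, h, rho):
    """Check P h = rho * h at every vertex."""
    return all(
        sum(p * h[y] for (s, _a, y), p in edge_probs.items() if s == x) == rho * h[x]
        for x in h
    )

def h_transform(edge_probs, h, rho):
    """p^h(x, a, y) = p(x, a, y) h(y) / (rho h(x))."""
    return {(x, a, y): p * h[y] / (rho * h[x]) for (x, a, y), p in edge_probs.items()}

def path_probability(edge_probs, path):
    """Product of the edge probabilities along a path given as a list of edges."""
    prob = Fr(1)
    for e in path:
        prob *= edge_probs[e]
    return prob

PH = h_transform(EDGE_PROBS, H, RHO)
path = [(0, "b", 1), (1, "a", 0), (0, "a", 0)]  # a path of length 3 from 0 to 0
lhs = path_probability(PH, path)
rhs = path_probability(EDGE_PROBS, path) * H[0] / (RHO ** 3 * H[0])
print(lhs, rhs)
```

The intermediate values of $h$ cancel telescopically along the path, which is why only $h(x)$ and $h(y)$ at the endpoints survive in the identity.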
With this result, it is now easy to deduce Theorem 1.1.

Proof of Theorem 1.1. Since $(X, E, \ell)$ is deterministic with label alphabet $\Sigma$, the outdegree
of every $x \in X$ is at most $|\Sigma|$. Equip the edges of $(X, E, \ell)$ with the transition probabilities
$p(x, a, y) = 1/|\Sigma|$, when $(x, a, y) \in E$. Then the $n$-step transition probabilities of the resulting
Markov chain are given by

$$p^{(n)}(x, y) = \frac{\bigl|\{ w \in L_{x,y} : |w| = n \}\bigr|}{|\Sigma|^n}.$$

Therefore, because $(X, E, \ell)$ is uniformly connected, we have

$$h(X) = h(L_{x,y}) = \limsup_{n \to \infty} \frac{1}{n} \log\bigl( p^{(n)}(x, y)\, |\Sigma|^n \bigr) = \log\bigl( \rho(P) \cdot |\Sigma| \bigr).$$

Analogously,

$$h(L^F_{x,y}) = \log\bigl( \rho_{x,y}(P_F) \cdot |\Sigma| \bigr).$$

By Theorem 2.4,

$$\sup_{x, y \in X} \rho_{x,y}(P_F) < \rho(P),$$

and this implies that

$$\sup_{x, y \in X} h(L^F_{x,y}) < h(X)$$

strictly. □
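The counting identity $p^{(n)}(x, y) = |\{w \in L_{x,y} : |w| = n\}| / |\Sigma|^n$ at the heart of this proof can be checked on a small deterministic graph. The graph `EDGES` and the function names below are hypothetical; determinism is what guarantees that distinct paths from $x$ carry distinct labels, so that counting words and summing path probabilities agree.

```python
from fractions import Fraction

# Hypothetical deterministic (not fully deterministic) graph over Sigma = {a, b}.
EDGES = [(0, "a", 1), (1, "a", 0), (0, "b", 0)]
SIGMA_SIZE = 2

def word_count(edges, x, y, n):
    """Number of words of length n in L_{x,y}."""
    current = {(x, "")}
    for _ in range(n):
        current = {(t, w + a) for (v, w) in current for (s, a, t) in edges if s == v}
    return len({w for (v, w) in current if v == y})

def n_step_probability(edges, x, y, n):
    """p^(n)(x, y) when every edge carries probability 1/|Sigma|."""
    step = Fraction(1, SIGMA_SIZE)
    dist = {x: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for v, mass in dist.items():
            for (s, _a, t) in edges:
                if s == v:
                    nxt[t] = nxt.get(t, Fraction(0)) + mass * step
        dist = nxt
    return dist.get(y, Fraction(0))

print(word_count(EDGES, 0, 0, 3), n_step_probability(EDGES, 0, 0, 3))
```

Vertex 1 has no outgoing edge labelled `b`, so the chain dies there with probability $1/2$; this is exactly the situation allowed by condition (1).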

Application to pairs of groups and their Schreier graphs

Let $G$ be a finitely generated group and $K$ a (not necessarily finitely generated) subgroup.
Let also $\Sigma$ be a finite alphabet and $\psi : \Sigma \to G$ be such that the set $\psi(\Sigma)$ generates
$G$ as a semigroup. We extend $\psi$ to a monoid homomorphism from $\Sigma^*$ to $G$ by
$\psi(w) = \psi(a_1) \cdots \psi(a_n)$, if $w = a_1 \cdots a_n$ with $a_i \in \Sigma$ (and $\psi(\epsilon) = 1_G$). The mapping $\psi$ is called a
semigroup presentation of $G$ in [8].

The Schreier graph $X = X(G, K, \psi)$ has vertex set

$$X = \{ Kg : g \in G \},$$

the set of all right $K$-cosets in $G$, and the set of all labelled, directed edges $E$ is given by

$$E = \{ e = (x, a, y) : x = Kg,\ y = Kg\psi(a), \text{ where } g \in G,\ a \in \Sigma \}.$$

Note that the graph $X$ is fully deterministic and uniformly connected.

The word problem of $(G, K)$ with respect to $\psi$ is the language

$$L(G, K, \psi) = \{ w \in \Sigma^* : \psi(w) \in K \}.$$

The word problem for a recursively presented group G is the algorithmic problem of deciding 
whether two words represent the same element. Also, this terminology is used in the context 
of formal language theory and goes back at least to the seminal paper of Muller and 
Schupp [12]. For additional information, see also Muller and Schupp [13]. In their work, 
for a finitely generated group G the word problem W(G) is the set of all words on the 
generators and their inverses which represent the identity element of G. 

If we consider the "root" vertex $o = K$ of the Schreier graph, then in the notation of the
introduction, we have $L(G, K, \psi) = L_{o,o}$; compare with [8, Lemma 2.4].
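The identification of the word problem with the language of closed paths at the root can be tested on a simple pair. The sketch below assumes the example $G = \mathbb{Z}$, $K = 3\mathbb{Z}$, $\Sigma = \{a, b\}$ with $\psi(a) = 1$ and $\psi(b) = -1$, none of which is taken from the paper; the cosets are represented by residues modulo 3, with the root $o = K$ represented by 0.

```python
from itertools import product

M = 3                      # assumed subgroup K = M*Z of G = Z
PSI = {"a": 1, "b": -1}    # assumed semigroup presentation psi : Sigma -> Z

def schreier_edges():
    """Edges (Kg, s, Kg psi(s)); the coset K + r is represented by r mod M."""
    return [(r, s, (r + g) % M) for r in range(M) for s, g in PSI.items()]

def in_word_problem(word):
    """True if psi(word) lies in K, that is, the letter sum is divisible by M."""
    return sum(PSI[s] for s in word) % M == 0

def closed_path_labels(edges, n):
    """Words of length n read along closed paths at the root vertex 0."""
    current = {(0, "")}
    for _ in range(n):
        current = {(t, w + s) for (v, w) in current for (x, s, t) in edges if x == v}
    return {w for (v, w) in current if v == 0}

EDGES = schreier_edges()
n = 4
by_definition = {"".join(w) for w in product(PSI, repeat=n) if in_word_problem("".join(w))}
print(sorted(closed_path_labels(EDGES, n)) == sorted(by_definition))
```

Each vertex of this Schreier graph has exactly one outgoing edge per letter, which is the full determinism noted above.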

We can therefore apply Theorem 1.1 and Corollary 1.2 to the graph $X(G, K, \psi)$ in order to
deduce that

Corollary 2.5. The word problem of the pair $(G, K)$ with respect to any semigroup
presentation $\psi$ is growth sensitive (with respect to forbidding an arbitrary non-empty finite subset
$F \subset \Sigma^*$).